SentiBench - a benchmark comparison of state-of-the-practice sentiment analysis methods
In the last few years thousands of scientific papers have investigated
sentiment analysis, several startups that measure opinions on real data have
emerged and a number of innovative products related to this theme have been
developed. There are multiple methods for measuring sentiments, including
lexical-based and supervised machine learning methods. Despite the vast
interest in the theme and the wide popularity of some methods, it is unclear which
one is better for identifying the polarity (i.e., positive or negative) of a
message. Accordingly, there is a strong need to conduct a thorough
apples-to-apples comparison of sentiment analysis methods, as they are
used in practice, across multiple datasets originating from different data
sources. Such a comparison is key for understanding the potential limitations,
advantages, and disadvantages of popular methods. This article aims at filling
this gap by presenting a benchmark comparison of twenty-four popular sentiment
analysis methods (which we call the state-of-the-practice methods). Our
evaluation is based on a benchmark of eighteen labeled datasets, covering
messages posted on social networks, movie and product reviews, as well as
opinions and comments in news articles. Our results highlight the extent to
which the prediction performance of these methods varies considerably across
datasets. Aiming at boosting the development of this research area, we open the
methods' codes and datasets used in this article, deploying them in a benchmark
system, which provides an open API for accessing and comparing sentence-level
sentiment analysis methods.
Mídias Sociais e Administração Pública: Análise do sentimento social perante a atuação do governo federal brasileiro
This study sought to identify how sentiment analysis, based on texts extracted from social media, can serve as an instrument for measuring public opinion about government performance and thereby contribute to the evaluation of public administration. It is an applied, interdisciplinary, exploratory, qualitative and quantitative study. The main theoretical and conceptual formulations on the topic were reviewed, and practical demonstrations were carried out using an opinion mining tool that provided satisfactory accuracy in data processing. For demonstration purposes, the topics selected were those that motivated the wave of protests involving millions of people in Brazil in June 2013. Approximately 130,000 messages posted on Facebook and Twitter about these topics in two distinct periods were collected, processed, and analyzed. This investigation showed that sentiment analysis can reveal citizens' polarized opinion regarding the government's performance.
A relativistic opinion mining approach to detect factual or opinionated news sources
19th International Conference on Big Data Analytics and Knowledge Discovery, DaWaK 2017; Lyon; France; 28 August 2017 through 31 August 2017. The credibility of news cannot be isolated from that of its source. Further, it is mainly associated with a news source’s trustworthiness and expertise. In an effort to measure the trustworthiness of a news source, the factor of “is factual or opinionated” must be considered among others. In this work, we propose an unsupervised probabilistic lexicon-based opinion mining approach to describe a news source as “being factual or opinionated”. We obtain each word’s positive, negative, and objective scores from a sentiment lexicon and normalize these scores using their cumulative distribution. This statistical approach is inspired by the relativist idea that each word is evaluated by its difference from the average word. To test the effectiveness of the approach, three different news sources are chosen: editorials, New York Times articles, and Reuters articles, which differ in how opinionated they are. The experimental validation is done by an analysis of variance on these different groups of news. The results show that our technique can distinguish the news articles from these groups with respect to “being factual or opinionated” in a statistically significant way. Scientific and Technological Research Council of Turkey under contract number 114E78
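The normalization step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract names neither the lexicon nor the exact form of the cumulative distribution, so the toy lexicon `TOY_LEXICON`, the use of an empirical CDF, and the function names are all assumptions made for illustration.

```python
from bisect import bisect_right

# Hypothetical toy lexicon: word -> (positive, negative, objective) scores.
# The values are invented for illustration and do not come from a real lexicon.
TOY_LEXICON = {
    "good": (0.75, 0.0, 0.25),
    "bad": (0.0, 0.625, 0.375),
    "table": (0.0, 0.0, 1.0),
    "terrible": (0.0, 0.875, 0.125),
    "report": (0.0, 0.0, 1.0),
}


def empirical_cdf(values):
    """Return a function mapping x to the fraction of values that are <= x."""
    ordered = sorted(values)
    n = len(ordered)

    def cdf(x):
        return bisect_right(ordered, x) / n

    return cdf


def normalize_lexicon(lexicon):
    """Replace each raw score with its empirical CDF value, column by column.

    Assumption: each score type (positive, negative, objective) is normalized
    against the distribution of that same score type over the whole lexicon,
    so a word is rated relative to the other words rather than in isolation.
    """
    columns = list(zip(*lexicon.values()))
    cdfs = [empirical_cdf(column) for column in columns]
    return {
        word: tuple(cdf(score) for cdf, score in zip(cdfs, scores))
        for word, scores in lexicon.items()
    }


normalized = normalize_lexicon(TOY_LEXICON)
```

In this sketch a normalized value close to 1 means the word scores higher than most other words in the lexicon for that dimension, which is one way to read "evaluated by its difference from the average word".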
Predicting Contradiction Intensity: Low, Strong or Very Strong?
Reviews on web resources (e.g. courses, movies) are increasingly exploited in text analysis tasks (e.g. opinion detection, controversy detection). This paper investigates contradiction intensity in reviews, exploiting different features such as variation of ratings and variation of polarities around specific entities (e.g. aspects, topics). Firstly, aspects are identified according to the distributions of the emotional terms in the vicinity of the most frequent nouns in the review collection. Secondly, the polarity of each review segment containing an aspect is estimated. Only resources containing these aspects with opposite polarities are considered. Finally, some features are evaluated, using feature selection algorithms, to determine their impact on the effectiveness of contradiction intensity detection. The selected features are used to train several state-of-the-art learning approaches. The experiments are conducted on the Massive Open Online Courses data set containing 2244 courses and their 73,873 reviews, collected from coursera.org. Results showed that variation of ratings, variation of polarities, and review quantity are the best predictors of contradiction intensity. Also, J48 was the most effective learning approach for this type of classification.
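The three predictors named in this abstract (variation of ratings, variation of polarities, and review quantity) could be computed along the following lines. This is a hedged sketch only: the abstract does not define "variation", so the population standard deviation is an assumption, and the record layout (`rating`, `polarity`) and function names are hypothetical.

```python
from statistics import pstdev


def has_opposite_polarities(reviews):
    """True if the review segments for an aspect contain both signs of polarity."""
    polarities = [review["polarity"] for review in reviews]
    return any(p > 0 for p in polarities) and any(p < 0 for p in polarities)


def contradiction_features(reviews):
    """Compute illustrative features for one aspect of one resource.

    Assumption: "variation" is taken here as the population standard deviation.
    Each review is a dict with a numeric "rating" and a numeric "polarity".
    """
    ratings = [review["rating"] for review in reviews]
    polarities = [review["polarity"] for review in reviews]
    return {
        "rating_variation": pstdev(ratings),
        "polarity_variation": pstdev(polarities),
        "review_count": len(reviews),
    }


# Hypothetical review segments mentioning the same aspect of one course.
sample = [
    {"rating": 1, "polarity": -1.0},
    {"rating": 5, "polarity": 1.0},
]
features = contradiction_features(sample) if has_opposite_polarities(sample) else None
```

A feature dictionary of this kind would then be fed to a classifier; the abstract reports J48 as the most effective one, but the training step is not shown here.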
Combining sentiment analysis scores to improve accuracy of polarity classification in MOOC posts
Sentiment analysis is a set of techniques that deal with the verification of sentiment and emotions in written texts. This introductory work aims to explore the combination of scores and polarities of sentiments (positive, neutral and negative) provided by different sentiment analysis tools. The goal is to generate a final score and its respective polarity from the normalization and arithmetic average of the scores given by those tools that provide a minimum level of reliability. The texts analyzed to test our hypotheses were obtained from forum posts by participants in a massive open online course (MOOC) offered by Universidade Aberta de Portugal, and were submitted to four online service APIs offering sentiment analysis: Amazon Comprehend, Google Natural Language, IBM Watson Natural Language Understanding, and Microsoft Text Analytics. The initial results are encouraging, suggesting that the average score is a valid way to increase the accuracy of the predictions from different sentiment analyzers.
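The combination described in this abstract (normalize each tool's score, then take the arithmetic average) can be sketched in Python as follows. The tool names, score ranges, and the `neutral_band` threshold are hypothetical placeholders, not the actual output formats of the four services, and the reliability filter mentioned in the abstract is not modelled.

```python
def rescale(score, low, high):
    """Linearly map a score from the range [low, high] onto [-1, 1]."""
    return 2 * (score - low) / (high - low) - 1


def combine(scores, ranges, neutral_band=0.25):
    """Average the normalized scores and derive a polarity label.

    scores: tool name -> raw score returned by that tool.
    ranges: tool name -> (low, high) bounds of that tool's score scale.
    neutral_band: assumed half-width of the interval around 0 labeled neutral.
    """
    normalized = [rescale(scores[tool], *ranges[tool]) for tool in scores]
    mean = sum(normalized) / len(normalized)
    if mean > neutral_band:
        label = "positive"
    elif mean < -neutral_band:
        label = "negative"
    else:
        label = "neutral"
    return mean, label


# Hypothetical tools with different score scales.
ranges = {"tool_a": (0.0, 1.0), "tool_b": (-1.0, 1.0), "tool_c": (0.0, 100.0)}
scores = {"tool_a": 0.75, "tool_b": 0.5, "tool_c": 75.0}
final_score, final_label = combine(scores, ranges)
```

Rescaling first matters because a raw 0.75 on a 0 to 1 scale and a raw 0.5 on a -1 to 1 scale express the same sentiment strength; averaging the raw values would weight the tools unevenly.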